Image classification for Caltech101 | Airplanes, Motorbikes & Schooners dataset!¶

merged_project_fig.png

Challenge¶

Multiclass image classification for the Caltech101 | Airplanes, Motorbikes & Schooners dataset using a CNN and TensorFlow 2.0

Dataset¶

The dataset contains the following 3 classes of non-uniform 2D images with varying resolutions:

  1. Airplanes : 800 images
  2. Motorbikes : 798 images
  3. Schooner : 63 images
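
These per-class counts can be reproduced directly from the directory layout. A minimal sketch, run here against a synthetic stand-in directory (the real dataset path is local, and the class-to-count mapping below is illustrative, not the actual 800/798/63):

```python
import pathlib
import tempfile

# Synthetic stand-in for the dataset directory: one subdirectory per class,
# each holding .jpg files (the real dataset uses the same layout).
root = pathlib.Path(tempfile.mkdtemp())
expected = {"airplanes": 4, "Motorbikes": 3, "schooner": 2}  # toy counts
for cls, n in expected.items():
    (root / cls).mkdir()
    for i in range(n):
        (root / cls / f"img_{i}.jpg").touch()

# Per-class and total counts, mirroring the data_dir.glob('*/*.jpg') pattern used later
counts = {d.name: len(list(d.glob("*.jpg"))) for d in sorted(root.iterdir()) if d.is_dir()}
total = sum(counts.values())
print(counts, total)
```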

This blog post covers the following aspects of multiclass image classification using a 2D CNN¶

1. Building a 2D CNN-based image classifier for multi-class labels using TensorFlow 2.0¶

  • Data visualization, pre-processing and data loading
  • Model creation, training, performance evaluation and training curves

2. Experiment: Effect of depth on model performance¶

  • Create models with varying depths
  • Plot number of layers vs accuracy

3. Experiment: Effect of dropout fraction on model performance¶

  • Create models with varying amounts of dropout applied after the penultimate dense layer
  • Plot dropout fraction vs accuracy

4. Observations:¶

  • Best validation performance is observed at 3 layers (2 Conv2D layers, 1 dense layer)
  • Overfitting and reduced accuracy on both training and validation data are observed when the number of layers is greater than 3. This is likely due to overfitting on at least one class
  • Using any amount of dropout after the penultimate layer resulted in decreased performance.

5. Building a model with the optimal number of layers and dropout fraction:¶

  • Create a model with 3 layers (2 Conv2D and 1 dense layer) and 0.1 dropout
  • Evaluate performance with a classification report
In [1]:
import numpy as np
import os
import PIL
import PIL.Image
import tensorflow as tf
import tensorflow_datasets as tfds
import pathlib
import matplotlib.pyplot as plt
from tensorflow.keras import layers
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Rescaling, Dropout
from tensorflow.keras.models import Sequential
seed_value = 332
tf.keras.backend.clear_session()
from sklearn.metrics import classification_report, balanced_accuracy_score
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
In [2]:
# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)
# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)
# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)
# 4. Set `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
tf.random.set_seed(seed_value)

Data visualization, pre-processing and data loading¶

In [3]:
# Note: replace 'str_dir' with the path to your data directory.
# A raw string (r'...') keeps the Windows backslashes from being read as escape sequences.
str_dir = r'D:\Fall 2022\Data Mining_CSE 5334\Assignments\Assignment1\Data\caltech101_classification'
data_dir = pathlib.Path(str_dir)

image_count = len(list(data_dir.glob('*/*.jpg')))
print(f'Total number of images in dataset: {image_count}')
Total number of images in dataset: 1661
In [4]:
# visualizing sample from class: Airplanes
airplanes = list(data_dir.glob('airplanes/*'))
airplane_sample = PIL.Image.open(str(airplanes[2]))
airplane_sample
Out[4]:
In [5]:
# visualizing sample from class: Motorbikes
motorbikes = list(data_dir.glob('Motorbikes/*'))
motorbike_sample = PIL.Image.open(str(motorbikes[0]))
motorbike_sample
Out[5]:
In [6]:
# visualizing sample from class: Schooners
schooners = list(data_dir.glob('schooner/*'))
schooner_sample = PIL.Image.open(str(schooners[2]))
schooner_sample
Out[6]:
In [7]:
#Data partitioning
batch_size = 32
img_height = 224
img_width = 224

train_ds = tf.keras.utils.image_dataset_from_directory(
              data_dir,
              validation_split=0.2,
              subset="training",
              seed=seed_value,
              image_size=(img_height, img_width),
              batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
              data_dir,
              validation_split=0.2,
              subset="validation",
              seed=seed_value,
              image_size=(img_height, img_width),
              batch_size=batch_size)

class_names = ['Motorbikes','airplanes','schooner']  # alphanumeric order used by image_dataset_from_directory
Found 1661 files belonging to 3 classes.
Using 1329 files for training.
Found 1661 files belonging to 3 classes.
Using 332 files for validation.
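
The 1329/332 split reported above follows directly from the 1661-file count; a quick check (assuming Keras truncates `validation_split * n` to an integer, which matches the printed counts):

```python
image_count = 1661
validation_split = 0.2

# Keras appears to take int(validation_split * n) validation samples (floor),
# leaving the remainder for training -- consistent with the counts printed above.
n_val = int(image_count * validation_split)
n_train = image_count - n_val
print(n_train, n_val)  # 1329 332
```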
In [8]:
#visualizing samples from each class
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
In [9]:
#prefetching the data for efficient dataloading
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

Model creation, training, performance evaluation and training curves¶

In [10]:
# Here I use the Keras Sequential API. Later in this post I will build models with the Keras functional API.
num_classes = len(class_names)

model = Sequential([
  layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy','sparse_categorical_accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 56, 56, 64)        18496     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 28, 28, 64)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 50176)             0         
                                                                 
 dense (Dense)               (None, 128)               6422656   
                                                                 
 dense_1 (Dense)             (None, 3)                 387       
                                                                 
=================================================================
Total params: 6,446,627
Trainable params: 6,446,627
Non-trainable params: 0
_________________________________________________________________
In [11]:
# training
epochs = 3
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
Epoch 1/3
42/42 [==============================] - 23s 542ms/step - loss: 0.8960 - accuracy: 0.7126 - sparse_categorical_accuracy: 0.7126 - val_loss: 0.1973 - val_accuracy: 0.9247 - val_sparse_categorical_accuracy: 0.9247
Epoch 2/3
42/42 [==============================] - 22s 527ms/step - loss: 0.0849 - accuracy: 0.9759 - sparse_categorical_accuracy: 0.9759 - val_loss: 0.1486 - val_accuracy: 0.9428 - val_sparse_categorical_accuracy: 0.9428
Epoch 3/3
42/42 [==============================] - 23s 560ms/step - loss: 0.0393 - accuracy: 0.9880 - sparse_categorical_accuracy: 0.9880 - val_loss: 0.1412 - val_accuracy: 0.9578 - val_sparse_categorical_accuracy: 0.9578
In [12]:
#visualizing the training results
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = [1,2,3]

plt.figure(figsize=(8, 3))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
In [13]:
# compute classification metrics (precision, recall, F1-score) on the validation data

y_pred = model.predict(val_ds)
score = tf.nn.softmax(y_pred)
predicted_labels = np.argmax(score,axis =1)
ls_val_labels =[]
for img,label in val_ds:
    ls_val_labels.append(label)

correct_labels = tf.concat([item for item in ls_val_labels], axis = 0)    

print(classification_report(correct_labels, predicted_labels, target_names=class_names))
11/11 [==============================] - 2s 121ms/step
              precision    recall  f1-score   support

  Motorbikes       0.92      1.00      0.96       146
   airplanes       0.99      0.92      0.96       167
    schooner       1.00      0.95      0.97        19

    accuracy                           0.96       332
   macro avg       0.97      0.96      0.96       332
weighted avg       0.96      0.96      0.96       332
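
Plain accuracy can look healthy even when the rare `schooner` class (only 19 validation images) is misclassified, which is why the experiments below use `balanced_accuracy_score`, i.e. the mean of per-class recalls. A small self-contained sketch with toy labels (class 2 playing the role of the rare class):

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; matches sklearn's balanced_accuracy_score."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# Toy labels: class 2 is rare (2 of 10 samples)
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 2, 0]  # one of the two rare samples missed

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)  # 0.9 vs ~0.833: the rare-class miss weighs more under balanced accuracy
```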

Experiment: Effect of depth on model performance¶

Below, a function builds a 2D CNN model with a given number of layers and dropout fraction. Note that this uses the Keras functional API rather than the Sequential API.

In [14]:
def make_model(num_layers, dropout_frac=0):
    """Build a 2D CNN with (num_layers - 1) Conv2D blocks followed by one dense layer."""
    layer_filters = [16, 32, 64, 128, 256]
    num_classes = len(class_names)
    inputs = Input(shape=(img_height, img_width, 3))
    x = Rescaling(1./255)(inputs)
    for num in range(num_layers - 1):
        x = Conv2D(layer_filters[num], 3, padding='same', activation='relu')(x)
        x = MaxPooling2D()(x)
    x = Flatten()(x)
    # the dense layer continues the filter progression (e.g. 64 units after 16/32 conv filters)
    x = Dense(layer_filters[num_layers - 1], activation='relu')(x)
    x = Dropout(dropout_frac)(x)
    outputs = Dense(num_classes)(x)

    model = Model(inputs=inputs, outputs=outputs)
    model.summary()
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model

Create models with varying depth and plot the effect.

In [15]:
ls_num_layers = [2,3,4,5]
ls_val_accuracy =[]
ls_train_accuracy =[]

for i in ls_num_layers:
    tf.keras.backend.clear_session()
    m = make_model(i)
    
    epochs =3
    print(f'---------------------Training model with {i} layers------------------------')
    history = m.fit(train_ds, validation_data=val_ds, epochs=epochs)
    # compute balanced accuracy on the training and validation data

    y_pred = m.predict(val_ds)
    score = tf.nn.softmax(y_pred)
    y_pred_train = m.predict(train_ds)
    score_train = tf.nn.softmax(y_pred_train)
    
    val_predicted_labels = np.argmax(score,axis =1)
    train_predicted_labels = np.argmax(score_train, axis=1)
    
    ls_val_labels =[]
    for img,label in val_ds:
        ls_val_labels.append(label)
        
    ls_train_labels =[]
    for img,label in train_ds:
        ls_train_labels.append(label)

    correct_labels_val = tf.concat([item for item in ls_val_labels], axis = 0)  
    correct_labels_train = tf.concat([item for item in ls_train_labels], axis = 0)
    validation_accuracy = balanced_accuracy_score(correct_labels_val,val_predicted_labels)
    train_accuracy = balanced_accuracy_score(correct_labels_train,train_predicted_labels)
   
    ls_val_accuracy.append(validation_accuracy)
    ls_train_accuracy.append(train_accuracy)
    
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 flatten (Flatten)           (None, 200704)            0         
                                                                 
 dense (Dense)               (None, 32)                6422560   
                                                                 
 dropout (Dropout)           (None, 32)                0         
                                                                 
 dense_1 (Dense)             (None, 3)                 99        
                                                                 
=================================================================
Total params: 6,423,107
Trainable params: 6,423,107
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 2 layers------------------------
Epoch 1/3
42/42 [==============================] - 10s 238ms/step - loss: 2.0677 - accuracy: 0.7788 - val_loss: 0.2086 - val_accuracy: 0.9217
Epoch 2/3
42/42 [==============================] - 10s 233ms/step - loss: 0.0957 - accuracy: 0.9684 - val_loss: 0.1131 - val_accuracy: 0.9729
Epoch 3/3
42/42 [==============================] - 10s 235ms/step - loss: 0.0274 - accuracy: 0.9917 - val_loss: 0.0578 - val_accuracy: 0.9789
11/11 [==============================] - 1s 73ms/step
42/42 [==============================] - 3s 78ms/step
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 100352)            0         
                                                                 
 dense (Dense)               (None, 64)                6422592   
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 dense_1 (Dense)             (None, 3)                 195       
                                                                 
=================================================================
Total params: 6,427,875
Trainable params: 6,427,875
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 3 layers------------------------
Epoch 1/3
42/42 [==============================] - 17s 406ms/step - loss: 2.0447 - accuracy: 0.7833 - val_loss: 0.2179 - val_accuracy: 0.9277
Epoch 2/3
42/42 [==============================] - 17s 407ms/step - loss: 0.0760 - accuracy: 0.9759 - val_loss: 0.0596 - val_accuracy: 0.9880
Epoch 3/3
42/42 [==============================] - 17s 409ms/step - loss: 0.0210 - accuracy: 0.9962 - val_loss: 0.0598 - val_accuracy: 0.9819
11/11 [==============================] - 1s 104ms/step
42/42 [==============================] - 5s 108ms/step
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 56, 56, 64)        18496     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 28, 28, 64)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 50176)             0         
                                                                 
 dense (Dense)               (None, 128)               6422656   
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 3)                 387       
                                                                 
=================================================================
Total params: 6,446,627
Trainable params: 6,446,627
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 4 layers------------------------
Epoch 1/3
42/42 [==============================] - 22s 520ms/step - loss: 0.4726 - accuracy: 0.8405 - val_loss: 0.0914 - val_accuracy: 0.9759
Epoch 2/3
42/42 [==============================] - 22s 530ms/step - loss: 0.0330 - accuracy: 0.9925 - val_loss: 0.1635 - val_accuracy: 0.9488
Epoch 3/3
42/42 [==============================] - 21s 510ms/step - loss: 0.0288 - accuracy: 0.9902 - val_loss: 0.0229 - val_accuracy: 0.9880
11/11 [==============================] - 1s 125ms/step
42/42 [==============================] - 6s 140ms/step
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 56, 56, 64)        18496     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 28, 28, 64)       0         
 2D)                                                             
                                                                 
 conv2d_3 (Conv2D)           (None, 28, 28, 128)       73856     
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 14, 14, 128)      0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 25088)             0         
                                                                 
 dense (Dense)               (None, 256)               6422784   
                                                                 
 dropout (Dropout)           (None, 256)               0         
                                                                 
 dense_1 (Dense)             (None, 3)                 771       
                                                                 
=================================================================
Total params: 6,520,995
Trainable params: 6,520,995
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 5 layers------------------------
Epoch 1/3
42/42 [==============================] - 29s 668ms/step - loss: 0.4059 - accuracy: 0.8330 - val_loss: 0.1414 - val_accuracy: 0.9518
Epoch 2/3
42/42 [==============================] - 26s 627ms/step - loss: 0.0678 - accuracy: 0.9797 - val_loss: 0.0715 - val_accuracy: 0.9789
Epoch 3/3
42/42 [==============================] - 25s 594ms/step - loss: 0.0300 - accuracy: 0.9910 - val_loss: 0.1435 - val_accuracy: 0.9548
11/11 [==============================] - 2s 156ms/step
42/42 [==============================] - 6s 153ms/step
In [16]:
# Bar plot: number of CNN layers vs accuracy
layers = ['2', '3', '4', '5']
x = np.arange(len(layers)) 
width = 0.25  
fig, ax = plt.subplots()
fig.set_figheight(4)
fig.set_figwidth(8)
rects1 = ax.bar(x - width/2, np.round(ls_val_accuracy,4), width, label='Validation')
rects2 = ax.bar(x + width/2, np.round(ls_train_accuracy,4), width, label='Train')
ax.set_ylabel('Accuracy')
ax.set_xlabel('Number of layers in CNN model')
ax.set_title('Number of CNN layers vs  Accuracy')
ax.set_xticks(x, layers)
ax.set_ylim([0.93,1.008])
ax.legend()
ax.bar_label(rects1, padding=3)
ax.bar_label(rects2, padding=3)
fig.tight_layout()
plt.show()

Experiment: Effect of dropout fraction on model performance¶

In [17]:
num_layers = 3
ls_dropout_frac = [0.1,0.2,0.3,0.4,0.5,0.6,0.7]
ls_val_accuracy =[]
ls_train_accuracy =[]

for i in ls_dropout_frac:
    tf.keras.backend.clear_session()
    m = make_model(num_layers,i)
    
    epochs =3
    print(f'---------------------Training model with {i} drop out fraction ------------------------')
    history = m.fit(train_ds, validation_data=val_ds, epochs=epochs)
    # compute balanced accuracy on the training and validation data

    y_pred = m.predict(val_ds)
    score = tf.nn.softmax(y_pred)
    y_pred_train = m.predict(train_ds)
    score_train = tf.nn.softmax(y_pred_train)
    
    val_predicted_labels = np.argmax(score,axis =1)
    train_predicted_labels = np.argmax(score_train, axis=1)
    
    ls_val_labels =[]
    for img,label in val_ds:
        ls_val_labels.append(label)
        
    ls_train_labels =[]
    for img,label in train_ds:
        ls_train_labels.append(label)

    correct_labels_val = tf.concat([item for item in ls_val_labels], axis = 0)  
    correct_labels_train = tf.concat([item for item in ls_train_labels], axis = 0)
    validation_accuracy = balanced_accuracy_score(correct_labels_val,val_predicted_labels)
    train_accuracy = balanced_accuracy_score(correct_labels_train,train_predicted_labels)
   
    ls_val_accuracy.append(validation_accuracy)
    ls_train_accuracy.append(train_accuracy)
    
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 100352)            0         
                                                                 
 dense (Dense)               (None, 64)                6422592   
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 dense_1 (Dense)             (None, 3)                 195       
                                                                 
=================================================================
Total params: 6,427,875
Trainable params: 6,427,875
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 0.1 drop out fraction ------------------------
Epoch 1/3
42/42 [==============================] - 23s 528ms/step - loss: 0.6212 - accuracy: 0.8187 - val_loss: 0.1763 - val_accuracy: 0.9398
Epoch 2/3
42/42 [==============================] - 22s 515ms/step - loss: 0.0741 - accuracy: 0.9782 - val_loss: 0.0702 - val_accuracy: 0.9759
Epoch 3/3
42/42 [==============================] - 18s 440ms/step - loss: 0.0182 - accuracy: 0.9962 - val_loss: 0.0225 - val_accuracy: 0.9880
11/11 [==============================] - 1s 116ms/step
42/42 [==============================] - 5s 120ms/step
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 100352)            0         
                                                                 
 dense (Dense)               (None, 64)                6422592   
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 dense_1 (Dense)             (None, 3)                 195       
                                                                 
=================================================================
Total params: 6,427,875
Trainable params: 6,427,875
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 0.2 drop out fraction ------------------------
Epoch 1/3
42/42 [==============================] - 18s 413ms/step - loss: 0.9059 - accuracy: 0.8277 - val_loss: 0.1255 - val_accuracy: 0.9639
Epoch 2/3
42/42 [==============================] - 17s 400ms/step - loss: 0.0680 - accuracy: 0.9812 - val_loss: 0.0649 - val_accuracy: 0.9759
Epoch 3/3
42/42 [==============================] - 17s 408ms/step - loss: 0.0274 - accuracy: 0.9932 - val_loss: 0.0597 - val_accuracy: 0.9759
11/11 [==============================] - 1s 111ms/step
42/42 [==============================] - 5s 112ms/step
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 100352)            0         
                                                                 
 dense (Dense)               (None, 64)                6422592   
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 dense_1 (Dense)             (None, 3)                 195       
                                                                 
=================================================================
Total params: 6,427,875
Trainable params: 6,427,875
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 0.3 drop out fraction ------------------------
Epoch 1/3
42/42 [==============================] - 18s 409ms/step - loss: 1.1329 - accuracy: 0.8209 - val_loss: 0.1679 - val_accuracy: 0.9488
Epoch 2/3
42/42 [==============================] - 17s 402ms/step - loss: 0.0930 - accuracy: 0.9729 - val_loss: 0.1178 - val_accuracy: 0.9729
Epoch 3/3
42/42 [==============================] - 17s 404ms/step - loss: 0.0445 - accuracy: 0.9895 - val_loss: 0.0737 - val_accuracy: 0.9729
11/11 [==============================] - 1s 107ms/step
42/42 [==============================] - 5s 117ms/step
---------------------Training model with 0.4 dropout fraction ------------------------
Epoch 1/3
42/42 [==============================] - 18s 413ms/step - loss: 0.7376 - accuracy: 0.8420 - val_loss: 0.1725 - val_accuracy: 0.9398
Epoch 2/3
42/42 [==============================] - 17s 414ms/step - loss: 0.0921 - accuracy: 0.9729 - val_loss: 0.0740 - val_accuracy: 0.9789
Epoch 3/3
42/42 [==============================] - 18s 425ms/step - loss: 0.0466 - accuracy: 0.9857 - val_loss: 0.0442 - val_accuracy: 0.9849
11/11 [==============================] - 1s 109ms/step
42/42 [==============================] - 5s 116ms/step
---------------------Training model with 0.5 dropout fraction ------------------------
Epoch 1/3
42/42 [==============================] - 19s 442ms/step - loss: 1.5438 - accuracy: 0.7381 - val_loss: 0.2373 - val_accuracy: 0.9307
Epoch 2/3
42/42 [==============================] - 17s 410ms/step - loss: 0.1430 - accuracy: 0.9586 - val_loss: 0.1224 - val_accuracy: 0.9518
Epoch 3/3
42/42 [==============================] - 18s 419ms/step - loss: 0.0654 - accuracy: 0.9767 - val_loss: 0.0743 - val_accuracy: 0.9669
11/11 [==============================] - 1s 105ms/step
42/42 [==============================] - 5s 112ms/step
---------------------Training model with 0.6 dropout fraction ------------------------
Epoch 1/3
42/42 [==============================] - 18s 430ms/step - loss: 1.1325 - accuracy: 0.7570 - val_loss: 0.2676 - val_accuracy: 0.9066
Epoch 2/3
42/42 [==============================] - 22s 518ms/step - loss: 0.1932 - accuracy: 0.9533 - val_loss: 0.1072 - val_accuracy: 0.9699
Epoch 3/3
42/42 [==============================] - 18s 433ms/step - loss: 0.1228 - accuracy: 0.9631 - val_loss: 0.1223 - val_accuracy: 0.9488
11/11 [==============================] - 1s 106ms/step
42/42 [==============================] - 5s 113ms/step
---------------------Training model with 0.7 dropout fraction ------------------------
Epoch 1/3
42/42 [==============================] - 19s 444ms/step - loss: 1.7336 - accuracy: 0.5847 - val_loss: 0.4060 - val_accuracy: 0.8765
Epoch 2/3
42/42 [==============================] - 19s 445ms/step - loss: 0.4270 - accuracy: 0.8269 - val_loss: 0.1878 - val_accuracy: 0.9157
Epoch 3/3
42/42 [==============================] - 18s 435ms/step - loss: 0.2824 - accuracy: 0.8789 - val_loss: 0.1541 - val_accuracy: 0.9187
11/11 [==============================] - 1s 107ms/step
42/42 [==============================] - 5s 115ms/step
In [18]:
# Barplot -- dropout fraction in the penultimate layer vs accuracy
dropout_frac = ['0.1','0.2','0.3','0.4','0.5','0.6','0.7']
x = np.arange(len(dropout_frac)) 
width = 0.25  
fig, ax = plt.subplots()
fig.set_figheight(4)
fig.set_figwidth(8)
rects1 = ax.bar(x - width/2, np.round(ls_val_accuracy,3), width, label='Validation')
rects2 = ax.bar(x + width/2, np.round(ls_train_accuracy,3), width, label='Train')
ax.set_ylabel('Accuracy')
ax.set_xlabel('Dropout fraction')
ax.set_title('Dropout fraction vs  Accuracy')
ax.set_xticks(x, dropout_frac)
ax.set_ylim([0.80,1.008])
ax.legend(fontsize = 8)
ax.bar_label(rects1, padding=3, fontsize = 6)
ax.bar_label(rects2, padding=3,fontsize = 6)
fig.tight_layout()
plt.show()

Building model with optimal number of layers and dropout fraction¶

Results:¶

Overall accuracy across the 3 classes is 98%. Per-class precision, recall and F1-score are as follows:

| Class      | Precision | Recall | F1-score |
|------------|-----------|--------|----------|
| Motorbikes | 0.97      | 1.00   | 0.98     |
| Airplanes  | 1.00      | 0.97   | 0.98     |
| Schooners  | 0.95      | 0.95   | 0.95     |
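As a sanity check on the table above, recall that the F1-score is the harmonic mean of precision and recall. A minimal snippet, with values taken from the rows above:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Schooner row: precision = 0.95, recall = 0.95
print(round(f1_score(0.95, 0.95), 2))  # -> 0.95
# Motorbikes row: precision = 0.97, recall = 1.00
print(round(f1_score(0.97, 1.00), 2))  # -> 0.98
```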
In [19]:
num_layers = 3
dropout_frac = 0.1

tf.keras.backend.clear_session()
m = make_model(num_layers, dropout_frac)

epochs = 3
print(f'---------------------Training model with {num_layers} layers and {dropout_frac} dropout fraction ------------------------')
history = m.fit(train_ds, validation_data=val_ds, epochs=epochs)

# Compute classification metrics (precision, recall, F1-score) on the validation dataset.
# Note: this assumes val_ds yields batches in the same order on every iteration
# (i.e. it is not reshuffled), so the collected labels line up with the predictions.
y_pred = m.predict(val_ds)
score = tf.nn.softmax(y_pred)
predicted_labels = np.argmax(score, axis=1)
ls_val_labels = []
for img, label in val_ds:
    ls_val_labels.append(label)

correct_labels = tf.concat(ls_val_labels, axis=0)
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
    
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 rescaling (Rescaling)       (None, 224, 224, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 224, 224, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 16)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 32)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 100352)            0         
                                                                 
 dense (Dense)               (None, 64)                6422592   
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 dense_1 (Dense)             (None, 3)                 195       
                                                                 
=================================================================
Total params: 6,427,875
Trainable params: 6,427,875
Non-trainable params: 0
_________________________________________________________________
---------------------Training model with 3 layers and 0.1 dropout fraction ------------------------
Epoch 1/3
42/42 [==============================] - 19s 441ms/step - loss: 1.5554 - accuracy: 0.8141 - val_loss: 0.1221 - val_accuracy: 0.9578
Epoch 2/3
42/42 [==============================] - 18s 423ms/step - loss: 0.0637 - accuracy: 0.9812 - val_loss: 0.1064 - val_accuracy: 0.9759
Epoch 3/3
42/42 [==============================] - 18s 422ms/step - loss: 0.0186 - accuracy: 0.9955 - val_loss: 0.0506 - val_accuracy: 0.9819
11/11 [==============================] - 1s 120ms/step
In [20]:
# Visualize the training results
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 3))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
In [21]:
# Compute classification metrics (precision, recall, F1-score) on the validation data
from sklearn.metrics import classification_report

y_pred = m.predict(val_ds)
score = tf.nn.softmax(y_pred)
predicted_labels = np.argmax(score, axis=1)
ls_val_labels = []
for img, label in val_ds:
    ls_val_labels.append(label)

correct_labels = tf.concat(ls_val_labels, axis=0)

print(classification_report(correct_labels, predicted_labels, target_names=class_names))
11/11 [==============================] - 1s 113ms/step
              precision    recall  f1-score   support

  Motorbikes       0.97      1.00      0.98       146
   airplanes       1.00      0.97      0.98       167
    schooner       0.95      0.95      0.95        19

    accuracy                           0.98       332
   macro avg       0.97      0.97      0.97       332
weighted avg       0.98      0.98      0.98       332
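The macro and weighted averages in the report can be reproduced from the per-class rows. The sketch below (precisions and supports copied from the report above) shows why the two averages differ when classes are imbalanced: macro averaging counts the small schooner class (19 samples) equally, while weighted averaging scales each class by its support.

```python
# Per-class precision and support, copied from the classification report above.
precisions = [0.97, 1.00, 0.95]   # Motorbikes, airplanes, schooner
supports = [146, 167, 19]

# Macro average: unweighted mean -- every class counts equally.
macro = sum(precisions) / len(precisions)

# Weighted average: each class weighted by its number of validation samples.
weighted = sum(p * s for p, s in zip(precisions, supports)) / sum(supports)

print(round(macro, 2), round(weighted, 2))  # -> 0.97 0.98
```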